Optimization of Text Classification Using Supervised and Unsupervised Learning Approach

نویسندگان

  • Manpreet Kaur
  • Vijay Kumar
چکیده

Text Classification, also known as text categorization, is the task of automatically allocating unlabeled documents into predefined categories. Text Classification means allocating a document to one or more categories or classes. The ability to accurately perform a classification task depends on the representations of documents to be classified. Text representations transform the textural documents into a compact format. Text Classification plays an important role in information mining, summarization, text recovery and question-answering. It uses several tools from information retrieval (IR) and Machine Learning. Here we are reviewing the effectiveness of different supervised and unsupervised learning approaches in text classification. KeywordsText Mining, Text Classification, Feature Extraction,Term Weighting, Linear SVC, SGD,K-Mean with cosine similarity.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

طبقه بندی و شناسایی رخساره‌های زمین‌شناسی با استفاده از داده‌های لرزه نگاری و شبکه‌های عصبی رقابتی

Geological facies interpretation is essential for reservoir studying. The method of classification and identification seismic traces is a powerful approach for geological facies classification and distinction. Use of neural networks as classifiers is increasing in different sciences like seismic. They are computer efficient and ideal for patterns identification. They can simply learn new algori...

متن کامل

Semi-Supervised Learning Based Prediction of Musculoskeletal Disorder Risk

This study explores a semi-supervised classification approach using random forest as a base classifier to classify the low-back disorders (LBDs) risk associated with the industrial jobs. Semi-supervised classification approach uses unlabeled data together with the small number of labelled data to create a better classifier. The results obtained by the proposed approach are compared with those o...

متن کامل

Optimizing Unsupervised Learning of Opinion Targets from Unstructured Reviews Using Wordnet Based Semantic Orientation

Opinion Target identification is an important task of opinion mining problem. Several approaches have been employed for this task which can be broadly divided into two major categories: supervised and unsupervised. The supervised approaches require training data which need manual work and are mostly domain dependent. Unsupervised technique is most popularly used due to its two main advantages: ...

متن کامل

Compressive Feature Learning

This paper addresses the problem of unsupervised feature learning for text data. Our method is grounded in the principle of minimum description length and uses a dictionary-based compression scheme to extract a succinct feature set. Specifically, our method finds a set of word k-grams that minimizes the cost of reconstructing the text losslessly. We formulate document compression as a binary op...

متن کامل

Evaluating the Effectiveness of Supervised and Unsupervised Classification Methods in Monitoring Regs (Case Study: Jazmourian Reg)

Due to its mobility and ability to move and its direct impact on residential areas and various developmental activities, the Ergs are of major importance in the desert areas, so monitoring of those is very important. Considering that the use of supervised and unguarded methods is considered as one of the most common methods in determining and monitoring land uses, in this research, the accuracy...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2015